Under-five mortality (U5M) has long been a major health concern in Sub-Saharan Africa. Ethiopia, a Sub-Saharan African country, shares this concern and has made modest progress toward the Millennium Development Goal (MDG) of reducing under-5 deaths. Despite this, Ethiopia's newborn and under-5 death rates stand at 30 and 50 per 1000 live births, respectively, far exceeding the SDG targets of 12 and 25 deaths per 1000 live births. The aim of this study is therefore to provide a descriptive analysis of under-five mortality and to estimate it. U5M is estimated using a space-time smoothing model, and a logistic regression model is then fitted to identify risk factors for under-five death.
For this study, we use the 2016 Ethiopian Demographic and Health Survey (EDHS). The EDHS is a national household survey conducted with support from USAID. For the 2016 EDHS, 15,683 women aged 15-49 and 12,688 men aged 15-59 were successfully interviewed from 16,650 households. In addition to topics commonly covered in DHS surveys, such as child and maternal health, family planning, nutrition, health behavior and knowledge, health care access, and child immunization, blood samples were collected from consenting individuals to test for HIV and anemia.
We incorporate six risk factors in this study, selected on the basis of previous studies and the literature.
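The variables used are region, preceding birth interval, number of antenatal care visits, place of delivery, type of residence, and size of the child at birth.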
These variables were selected to represent socio-economic, demographic, clinical, and family-planning-related risk factors.
Data preprocessing mainly involves re-factoring variables, grouping values, and merging tables.
library('readstata13')
library('data.table')
library('tidyverse')
library('knitr')
library('kableExtra')
library('gtsummary')
library('table1')
library('SUMMER')
library('ggplot2')
library('gridExtra')
library('sp')
library('rigr')
library('ggpubr')
library('dbplyr')
library('boot')
The original data were in .dta format (a Stata file), so the readstata13 library is required to read them.
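The loading step itself is not shown in the text, so the following is only a sketch of how it might look with `read.dta13()` from readstata13. The helper name `load_dhs`, its `path` argument, and the choice of `convert.factors = FALSE` (to keep numeric codes rather than Stata value labels) are assumptions for illustration; the actual file location is not given. A tiny synthetic file is written first so that the sketch runs without the survey data.

```r
library(readstata13)

# Hypothetical helper: `path` stands for wherever the EDHS .dta file is
# stored locally (the text does not say). convert.factors = FALSE is an
# assumption that keeps the numeric DHS codes instead of value labels.
load_dhs <- function(path) {
  read.dta13(path, convert.factors = FALSE)
}

# Synthetic stand-in for the survey file; the values are made up.
toy <- data.frame(
  v001 = c(1L, 1L, 2L),    # cluster number
  b19  = c(12L, 70L, 30L), # age of child in months
  b5   = c(1L, 1L, 0L)     # child is alive (1 = yes, 0 = no)
)
tmp <- tempfile(fileext = ".dta")
save.dta13(toy, tmp)

# Read it back the same way the real file would be read.
df_1 <- load_dhs(tmp)
```

With the real data, only the call `df_1 <- load_dhs(path)` would be needed, with `path` pointing at the downloaded file.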
The data contains more than 1287 columns.
Once the data are loaded, we rename variables for convenience. First, we keep only the records where the difference between the date of interview and the date of birth is less than 60 months (the five-year cohort). This difference is encoded as b19 in the data.
df <- setDT(df_1)
df <- df[(b19 < 60)]
Once the relevant samples have been selected, we rename the variables and keep only those needed for modeling.
df <- df[, .(region = v024, clust_no = v001, bir_int = b11, num_anc = m14, fac_del = m15, outcome = b5, type_res = v025, size_child = m18)]
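The grouping of values is not shown in the text. As a rough sketch of what the filtering, renaming, and recoding could look like together, the example below runs the same data.table steps on a few synthetic rows. The cut-offs are hypothetical: a birth interval under 24 months is flagged as short, `m15` codes of 20 or above are treated as facility deliveries, and the outcome is taken as `1 - b5` on the assumption that `b5` is stored as 1 for alive and 0 for dead.

```r
library(data.table)

# Synthetic rows using the DHS variable names from the text; values are made up.
df_1 <- data.frame(
  v024 = c(1, 1, 2, 2),     # region
  v001 = c(1, 1, 5, 5),     # cluster number
  b11  = c(18, 40, NA, 30), # preceding birth interval in months
  m15  = c(11, 21, 11, 22), # place of delivery code
  b5   = c(1, 0, 1, 1),     # child is alive (1 = yes, 0 = no)
  b19  = c(10, 3, 72, 59)   # age of child in months
)

df <- setDT(df_1)
df <- df[b19 < 60] # keep the five-year cohort

# Rename and recode in one step. All thresholds below are assumptions.
df <- df[, .(
  region   = v024,
  clust_no = v001,
  bir_int  = as.integer(!is.na(b11) & b11 < 24), # 1 = short interval
  fac_del  = as.integer(m15 >= 20),              # 1 = facility delivery
  outcome  = 1L - as.integer(b5)                 # 1 = child died
)]
```

Recoding inside the `.()` call keeps the original columns untouched until the final assignment, which makes it easy to compare raw and grouped values while the cut-offs are still being decided.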
Finally, after preprocessing, the dataset looks like the following.
| region | clust_no | bir_int | fac_del | type_res | size_child | outcome |
|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 1 | 3 | 0 |
| 1 | 1 | 0 | 0 | 1 | 3 | 0 |
| 1 | 1 | 0 | 0 | 1 | 4 | 0 |
| 1 | 1 | 1 | 0 | 1 | 3 | 1 |
| 1 | 1 | 0 | 0 | 1 | 4 | 0 |
| 1 | 1 | 0 | 0 | 1 | 3 | 0 |
Once the data are preprocessed, mortality is estimated using a space-time model. The space-time model is a Bayesian statistical model that estimates the prevalence of U5M across space and time. It smooths (interpolates) the estimates by borrowing information from areas and periods that are close in space and time.
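The text does not state the exact model specification. One common formulation of a space-time smoothing model for direct survey estimates is sketched below as an illustration only; all symbols are assumptions introduced here. In this sketch, $\hat{p}_{it}$ is the survey-weighted direct estimate of U5M in region $i$ and period $t$, and $\hat{V}_{it}$ is its design-based variance on the logit scale.

```latex
\begin{aligned}
y_{it} \mid \eta_{it} &\sim \mathcal{N}\!\left(\eta_{it},\, \hat{V}_{it}\right),
  \qquad y_{it} = \operatorname{logit}\!\left(\hat{p}_{it}\right), \\
\eta_{it} &= \mu + \alpha_t + \gamma_t + \theta_i + \phi_i + \delta_{it}.
\end{aligned}
```

Here $\mu$ is an overall intercept, $\alpha_t$ and $\gamma_t$ are unstructured and structured (for example, random walk) temporal effects, $\theta_i$ and $\phi_i$ are unstructured and spatially structured (for example, intrinsic CAR) regional effects, and $\delta_{it}$ is a space-time interaction. The structured terms are what allow neighbouring regions and adjacent periods to inform each other.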
Once the national and subnational mortality estimates are obtained from the space-time model, we analyze the important risk factors contributing to under-five mortality. We then assess whether these factors are statistically significant and associated with child death.
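As a minimal sketch of the risk-factor step, the following fits an ordinary logistic regression with base R `glm()` on simulated data. The data, the effect sizes, and the choice of three binary predictors are assumptions for illustration only. The column names mirror the preprocessed table, and survey weights and clustering are ignored here, so this is not necessarily the specification used in the study.

```r
set.seed(1)
n <- 500

# Simulated binary risk factors; names follow the preprocessed table.
toy <- data.frame(
  bir_int  = rbinom(n, 1, 0.3), # 1 = short birth interval (assumed coding)
  fac_del  = rbinom(n, 1, 0.4), # 1 = facility delivery (assumed coding)
  type_res = rbinom(n, 1, 0.7)  # 1 = rural residence (assumed coding)
)

# Simulate deaths from an assumed linear predictor on the logit scale.
lp <- -2.5 + 0.6 * toy$bir_int - 0.4 * toy$fac_del
toy$outcome <- rbinom(n, 1, plogis(lp))

# Fit the logistic regression of death on the risk factors.
fit <- glm(outcome ~ bir_int + fac_del + type_res,
           data = toy, family = binomial())

# Odds ratios with Wald 95% confidence intervals.
or_table <- exp(cbind(OR = coef(fit), confint.default(fit)))
print(round(or_table, 2))
```

An odds ratio above 1 indicates higher odds of death for children with that characteristic, and a confidence interval that excludes 1 corresponds to significance at the 5% level.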
In our analysis, we performed exploratory data analysis to visualize the distribution of mortality across demographic groupings, specifically region, as shown in Fig 1.
Fig 1: Bar graph
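A bar graph of this kind can be produced from the preprocessed data by computing the proportion of deaths within each region. The sketch below uses base R on synthetic rows and reports raw, unweighted proportions per 1000 children. The values are made up, and the study's own figure may use survey weights or a different scale.

```r
# Synthetic data: region code and death indicator (1 = child died).
toy <- data.frame(
  region  = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 3),
  outcome = c(0, 0, 0, 1, 0, 1, 0, 0, 0, 0)
)

# Unweighted deaths per 1000 children in each region.
rate_by_region <- tapply(toy$outcome, toy$region, mean) * 1000

barplot(rate_by_region,
        xlab = "Region",
        ylab = "Deaths per 1000 children",
        main = "Under-five deaths by region (synthetic data)")
```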
A descriptive analysis is performed of the association between the response and the explanatory variables. Table 2 below shows the association of each explanatory variable with the outcome variable.